What did you Mention? A Large Scale Mention Detection Benchmark for Spoken and Written Text

نویسندگان

  • Yosi Mass
  • Lili Kotlerman
  • Shachar Mirkin
  • Elad Venezian
  • Gera Witzling
  • Noam Slonim
چکیده

We describe a large, high-quality benchmark for the evaluation of Mention Detection tools. The benchmark contains annotations of both named entities as well as other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia, to noisy spoken data. The benchmark was built through a highly controlled crowd sourcing process to ensure its quality. We describe the benchmark, the process and the guidelines that were used to build it. We then demonstrate the results of a state-of-the-art system running

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Company Mention Detection for Large Scale Text Mining

Text mining on a large scale that addresses actionable prediction needs to content with noisy information in documents, and with interdependencies between the kinds of NLP techniques applied and the data representation of instances. This paper presents an initial investigation of the impact of improved company mention detection for financial analytics. Coverage of company mention detection impr...

متن کامل

Corefrence resolution with deep learning in the Persian Labnguage

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...

متن کامل

Stephen Hawking's Community-Bound Voice A Functional Investigation of Self-Mentions in Stephen Hawking's Scientific Prose

Thanks to the development of the concept of metadiscourse, it is now widely acknowledged that academic/scientific writing is not only concerned with communicating purely propositional meanings: what is communicated through academic/scientific communication is seen to be intertwined with the negotiation of social and interpersonal meanings. While a large number of so called metadiscoursal resour...

متن کامل

Bridge Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding

Integrating text and knowledge into a unified semantic space has attracted significant research interests recently. However, the ambiguity in the common space remains a challenge, namely that the same mention phrase usually refers to various entities. In this paper, to deal with the ambiguity of entity mentions, we propose a novel Multi-Prototype Mention Embedding model, which learns multiple s...

متن کامل

Bridging Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding

Integrating text and knowledge into a unified semantic space has attracted significant research interests recently. However, the ambiguity in the common space remains a challenge, namely that the same mention phrase usually refers to various entities. In this paper, to deal with the ambiguity of entity mentions, we propose a novel Multi-Prototype Mention Embedding model, which learns multiple s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1801.07507  شماره 

صفحات  -

تاریخ انتشار 2018